{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`IntArray` is a data structure to store and manipulate an array of unsigned integers using as small space as possible.\n",
    "Each element in the `IntArray{w}` type is packed using `w` bits. So you can store integers between `0x00` and `0x03` with 2 bits per element while `Array{UInt8}` requires 8 bits: the required space is one-quarter!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Types"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "using IntArrays"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Only one type is exported from this package: `IntArray`.\n",
    "The `IntArray` type takes three type parameters:\n",
    "\n",
    "* `w`: the **w**idth (or bits) of integers\n",
    "* `T`: the element **t**ype of an array\n",
    "* `n`: the **n**umber of dimensions of an array.\n",
    "\n",
    "For example, `IntArray{2,UInt8,1}` is a one-dimensional array (or vector) of `UInt8` integers packed in `2` bits.\n",
    "\n",
    "Like the `Vector{T}` and `Matrix{T}` types in the standard library, `IntVector{w,T}` and `IntMatrix{w,T}` are type aliases of `IntArray{w,T,1}` and `IntArray{w,T,2}`, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "srand(123456);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Constructions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can construct an `IntArray` object from an existing array with given width:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8-element Array{UInt8,1}:\n",
       " 0x03\n",
       " 0x00\n",
       " 0x01\n",
       " 0x02\n",
       " 0x03\n",
       " 0x03\n",
       " 0x01\n",
       " 0x03"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# construct an array\n",
    "vec = rand(0x00:0x03, 8)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x03\n",
       " 0x00\n",
       " 0x01\n",
       " 0x02\n",
       " 0x03\n",
       " 0x03\n",
       " 0x01\n",
       " 0x03"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# convert it to IntVector with 2-bit packing\n",
    "ivec = IntArray{2}(vec)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The size is one-quarter of the original array:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sizeof(vec)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sizeof(ivec)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can allocate an `IntArray` object with given width, element type, and size:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element IntArrays.IntArray{4,UInt8,1}:\n",
       " 0x00\n",
       " 0x00\n",
       " 0x04\n",
       " 0x0f"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# allocate a vector object\n",
    "ivec = IntArray{4,UInt8}(4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4x5 IntArrays.IntArray{9,UInt16,2}:\n",
       " 0x00d0  0x0000  0x00a5  0x0000  0x0174\n",
       " 0x0096  0x0000  0x0143  0x0000  0x0083\n",
       " 0x0067  0x0000  0x0042  0x0000  0x0000\n",
       " 0x0021  0x0160  0x0000  0x0041  0x0000"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# allocate a matrix object\n",
    "imat = IntArray{9,UInt16}(4, 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The elements of the allocated matrix are uninitialized, so you should fill them if necessary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4x5 IntArrays.IntArray{9,UInt16,2}:\n",
       " 0x0000  0x0000  0x0000  0x0000  0x0000\n",
       " 0x0000  0x0000  0x0000  0x0000  0x0000\n",
       " 0x0000  0x0000  0x0000  0x0000  0x0000\n",
       " 0x0000  0x0000  0x0000  0x0000  0x0000"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fill!(imat, 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also use the `IntVector` and `IntMatrix` type aliases to construct a one-/two-dimensional array:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "6-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x00\n",
       " 0x00\n",
       " 0x02\n",
       " 0x03\n",
       " 0x02\n",
       " 0x01"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ivec = IntVector{2,UInt8}(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3x4 IntArrays.IntArray{2,UInt8,2}:\n",
       " 0x00  0x01  0x00  0x00\n",
       " 0x00  0x01  0x00  0x03\n",
       " 0x01  0x01  0x02  0x01"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "imat = IntMatrix{2,UInt8}(3, 4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Operations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `IntArray{w,T,n}` type inherits from the `AbstractArray{T,n}` type.\n",
    "That means an `IntArray` object behaves like a common array in many ways.\n",
    "Let's create an `IntVector` object with 2-bit encoding and see available operations on it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x00\n",
       " 0x01\n",
       " 0x02\n",
       " 0x03"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ivec = IntVector{2}([0x00, 0x01, 0x02, 0x03])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`IntArray` supports random access like normal arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0x01"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ivec[2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x00\n",
       " 0x03\n",
       " 0x02\n",
       " 0x03"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ivec[2] = 0x03\n",
    "ivec"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x00\n",
       " 0x01\n",
       " 0x02\n",
       " 0x03"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ivec[2] = 1\n",
    "ivec"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Please keep in mind that elements are truncated to `w` bits without any warnings:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x01\n",
       " 0x01\n",
       " 0x02\n",
       " 0x03"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 0x05 cannot be encoded in 2 bits!\n",
    "ivec[1] = 0x05\n",
    "# in this case, 0x05 is truncated to 0x01 because 0x05 && 0b11 = 0x01\n",
    "ivec"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`IntVector` behaves like a `Vector`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x00"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ivec = IntVector{2}([0x00])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x00\n",
       " 0x01"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "push!(ivec, 0x01)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x00\n",
       " 0x01\n",
       " 0x02\n",
       " 0x03\n",
       " 0x00"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "append!(ivec, [0x02, 0x03, 0x00])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0x00"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pop!(ivec)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element IntArrays.IntArray{2,UInt8,1}:\n",
       " 0x03\n",
       " 0x02\n",
       " 0x01\n",
       " 0x00"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reverse!(ivec)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0x00,0x03)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "minimum(ivec), maximum(ivec)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0x00,0x03)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "extrema(ivec)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "findnext(ivec, 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0x00,4)"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "findmin(ivec)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Performance"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Packing and unpacking integers require several CPU clocks. Therefore, in general, random access operations over `IntArray` objects are slower:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "vec = rand(0x00:0x0f, 100000)\n",
    "ivec = IntVector{4}(vec);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  0.003387 seconds (8 allocations: 98.016 KB)\n"
     ]
    }
   ],
   "source": [
    "sort(vec)\n",
    "@time sort(vec);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  0.012687 seconds (18 allocations: 49.672 KB)\n"
     ]
    }
   ],
   "source": [
    "sort(ivec);\n",
    "@time sort(ivec);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "vec = rand(0x00:0x03, 10000000)\n",
    "ivec = IntVector{2}(vec);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  0.315935 seconds (8 allocations: 9.537 MB)\n"
     ]
    }
   ],
   "source": [
    "sort(vec)\n",
    "@time sort(vec);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  1.392724 seconds (18 allocations: 2.385 MB)\n"
     ]
    }
   ],
   "source": [
    "sort(ivec)\n",
    "@time sort(ivec);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But this performance gap may diminish when a specialized algorithm can be applied.\n",
    "For example, radix sort runs faster when the bit width is small enough:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  0.216997 seconds (16 allocations: 2.385 MB)\n"
     ]
    }
   ],
   "source": [
    "radixsort(ivec)\n",
    "@time radixsort(ivec);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Julia Version 0.4.0-dev+6759\n",
      "Commit c4c9010* (2015-08-16 03:39 UTC)\n",
      "Platform Info:\n",
      "  System: Darwin (x86_64-apple-darwin14.4.0)\n",
      "  CPU: Intel(R) Core(TM) i5-4288U CPU @ 2.60GHz\n",
      "  WORD_SIZE: 64\n",
      "  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)\n",
      "  LAPACK: libopenblas\n",
      "  LIBM: libopenlibm\n",
      "  LLVM: libLLVM-3.3\n"
     ]
    }
   ],
   "source": [
    "versioninfo()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 0.4.0-dev",
   "language": "julia",
   "name": "julia-0.4"
  },
  "language_info": {
   "name": "julia",
   "version": "0.4.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
